Will person detection help bag-of-features action recognition?

Authors

  • Alexander Klaser
  • Marcin Marszalek
  • Ivan Laptev
  • Cordelia Schmid
Abstract

Bag-of-features (BoF) models currently achieve state-of-the-art performance for action recognition. While such models do not explicitly account for people in video, person localization combined with BoF is expected to give further improvement for action recognition. The purpose of this paper is to validate this assumption and to quantify the improvements in action recognition expected from current and future person detectors. Given the locations of people in video, we find, somewhat surprisingly, that background suppression leads to only a limited gain in performance. This holds for actions in both simple and complex scenes. On the other hand, we show how the spatial locations of people make it possible to incorporate strong geometrical constraints into BoF models and, in this way, to improve the accuracy of action recognition in some cases. Our conclusions are validated with extensive experiments on three datasets of varying complexity: the basic KTH, the realistic UCF Sports, and the challenging Hollywood dataset.

Keywords: computer vision, action recognition, video, bag-of-features, human detection, tracking, classification, local descriptors

Affiliations: INRIA Grenoble, LEAR, LJK ({klaser,schmid}@inrialpes.fr); Visual Geometry Group, University of Oxford; INRIA / Ecole Normale Supérieure, Paris
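The two uses of person locations named in the abstract, background suppression and geometrical constraints, can be illustrated with a small sketch. It is not the authors' implementation: the feature layout `(x, y, frame, visual_word)`, the per-frame box dictionary, the function names, and the choice of horizontal bands over the person box as the geometric constraint are all assumptions made for illustration.

```python
from collections import Counter

def inside(box, x, y):
    """Return True if point (x, y) lies in box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return x0 <= x <= x1 and y0 <= y <= y1

def suppress_background(features, boxes):
    """Keep only features that fall inside the person box of their frame.

    features: list of (x, y, frame, visual_word) tuples (hypothetical layout).
    boxes: dict mapping frame index -> person box (hypothetical layout).
    """
    return [f for f in features
            if f[2] in boxes and inside(boxes[f[2]], f[0], f[1])]

def bof_histogram(features, vocab_size):
    """L1-normalized histogram of visual-word occurrences."""
    counts = Counter(f[3] for f in features)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * vocab_size
    return [counts.get(w, 0) / total for w in range(vocab_size)]

def person_grid_histogram(features, boxes, vocab_size, rows=2):
    """Concatenate one BoF histogram per horizontal band of the person box.

    This person-relative grid is one assumed way to add geometry to BoF.
    """
    cells = [[] for _ in range(rows)]
    for x, y, t, w in suppress_background(features, boxes):
        x0, y0, x1, y1 = boxes[t]
        height = max(y1 - y0, 1e-9)  # guard against degenerate boxes
        r = min(int((y - y0) / height * rows), rows - 1)
        cells[r].append((x, y, t, w))
    out = []
    for cell in cells:
        out.extend(bof_histogram(cell, vocab_size))
    return out

# Toy data: four features, one of them on the background.
features = [(5, 5, 0, 1), (50, 50, 0, 2), (6, 9, 0, 1), (5, 2, 1, 0)]
boxes = {0: (0, 0, 10, 10), 1: (0, 0, 10, 10)}
kept = suppress_background(features, boxes)
hist = bof_histogram(kept, vocab_size=3)
grid = person_grid_histogram(features, boxes, vocab_size=3, rows=2)
```

In this toy setup the plain histogram only drops the background feature, whereas the grid variant also records in which part of the person box each visual word occurred, which is the kind of extra geometric information the abstract refers to.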


Similar resources

Improving bag-of-features action recognition with non-local cues

Local space-time features have recently shown promising results within the Bag-of-Features (BoF) approach to action recognition in video. Purely local features and descriptors, however, provide only limited discriminative power, implying ambiguity among features and sub-optimal classification performance. In this work, we propose to disambiguate local space-time features and to improve action recognit...


Trajectory aligned features for first person action recognition

Egocentric videos are characterised by their first-person view. With the popularity of Google Glass and GoPro, the use of egocentric videos is on the rise. Recognizing the actions of the wearer from egocentric videos is an important problem. Unstructured movement of the camera due to natural head motion of the wearer causes sharp changes in the visual field of the egocentric camera c...


Application of Combined Local Object Based Features and Cluster Fusion for the Behaviors Recognition and Detection of Abnormal Behaviors

In this paper, we propose a novel framework for behavior recognition and for the detection of certain types of abnormal behaviors, capable of achieving high detection rates on a variety of real-life scenes. The proposed approach combines location-based methods with object-based ones. First, a novel approach is formulated to use optical flow and binary motion video as the loc...


Learning person-object interactions for action recognition in still images

We investigate a discriminatively trained model of person-object interactions for recognizing common human actions in still images. We build on the locally order-less spatial pyramid bag-of-features model, which was shown to perform extremely well on a range of object, scene and human action recognition tasks. We introduce three principal contributions. First, we replace the standard quantized ...


Action Change Detection in Video Based on HOG

Background and Objectives: Action recognition, the process of labeling an unknown action in a query video, is a challenging problem due to event complexity, variations in imaging conditions, and intra- and inter-individual action variability. A number of solutions have been proposed to solve the action recognition problem. Many of these frameworks suppose that each video sequence includes only one ...




Publication date: 2010